AITopics

Country:

Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.45)

Industry: Information Technology > Services (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.68)
Information Technology > Game Theory (0.68)

Neural Information Processing Systems, Feb-11-2026, 06:55:49 GMT

aa933b5abc1be30baece1d230ec575a7-Supplemental-Conference.pdf

baseline, chemcp, gene 0, (15 more...)

Genre: Research Report > Experimental Study (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.48)

Wu, Jinran, Wang, You-Gan, McLachlan, Geoffrey J.

Informative missingness and its implications in semi-supervised learning

arXiv.org Machine Learning, Dec-5-2025

Semi-supervised learning (SSL) constructs classifiers using both labelled and unlabelled data. It leverages information from labelled samples, whose acquisition is often costly or labour-intensive, together with unlabelled data to enhance prediction performance. This defines an incomplete-data problem, which can be formulated statistically within the likelihood framework for finite mixture models and fitted using the expectation-maximisation (EM) algorithm. Ideally, one would prefer a completely labelled sample, as one would anticipate that a labelled observation provides more information than an unlabelled one. However, when the mechanism governing label absence depends on the observed features, the class labels, or both, the missingness indicators themselves contain useful information. In certain situations, the information gained from modelling the missing-label mechanism can even outweigh the loss due to missing labels, yielding a classifier with a smaller expected error than one based on a completely labelled sample. This improvement arises particularly when class overlap is moderate, labelled data are sparse, and the missingness is informative. Modelling such informative missingness thus offers a coherent statistical framework that unifies likelihood-based inference with the behaviour of empirical SSL methods.

information, mechanism, missingness, (15 more...)

arXiv.org Machine Learning

2512.04392

Country:

Oceania > Australia > Queensland (0.04)
Asia > China > Hunan Province (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Julien Audiffren, Liva Ralaivola

Bandits Dueling on Partially Ordered Sets

Neural Information Processing Systems, Nov-21-2025, 11:57:00 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, pareto front, (17 more...)

Country:

Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Industry:

Media > Film (0.68)
Leisure & Entertainment (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence, Oct-31-2025

Completion ≠ Collaboration: Scaling Collaborative Effort with Agents

Shen, Shannon Zejiang, Chen, Valerie, Gu, Ken, Ross, Alexis, Ma, Zixian, Ross, Jillian, Gu, Alex, Si, Chenglei, Chi, Wayne, Peng, Andi, Shen, Jocelyn J, Talwalkar, Ameet, Wu, Tongshuang, Sontag, David

Current evaluations of agents remain centered around one-shot task completion, failing to account for the inherently iterative and collaborative nature of many real-world problems, where human goals are often underspecified and evolve. We argue for a shift from building and assessing task completion agents to developing collaborative agents, assessed not only by the quality of their final outputs but by how well they engage with and enhance human effort throughout the problem-solving process. To support this shift, we introduce collaborative effort scaling, a framework that captures how an agent's utility grows with increasing user involvement. Through case studies and simulated evaluations, we show that state-of-the-art agents often underperform in multi-turn, real-world scenarios, revealing a missing ingredient in agent design: the ability to sustain engagement and scaffold user understanding. Collaborative effort scaling offers a lens for diagnosing agent behavior and guiding development toward more effective interactions.

large language model, machine learning, natural language, (17 more...)

2510.25744

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.68)

Industry:

Banking & Finance (1.00)
Education (0.93)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(4 more...)

arXiv.org Artificial Intelligence, Oct-29-2025

Accelerate Scaling of LLM Finetuning via Quantifying the Coverage and Depth of Instruction Set

Wu, Chengwei, Du, Li, Zhao, Hanyu, Ju, Yiming, Wang, Jiapu, Chen, Tianyu, Zhou, Haoyi

Scaling the amount of data used for supervised fine-tuning (SFT) does not guarantee proportional gains in model performance, highlighting a critical need to understand what makes training samples effective. This work identifies two fundamental dataset properties that govern SFT scalability: semantic coverage, or the breadth of task domains, and information depth, or the richness of individual examples. We demonstrate that simple proxies for these properties explain the majority of validation loss variance in our experiments. We further propose the Information Landscape Approximation (ILA), a model-agnostic data selection framework that jointly optimizes for these two factors. ILA constructs compact subsets that approximate the informational value of large datasets. Empirical results show that models tuned on ILA-selected data achieve faster and more sustained performance improvements across diverse tasks and model sizes compared to existing methods, a phenomenon we term accelerated scaling.

large language model, machine learning, natural language, (18 more...)

2509.06463

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Ruangtanusak, Saksorn, Taveekitworachai, Pittawat, Pipatanakul, Kunat

Talk Less, Call Right: Enhancing Role-Play LLM Agents with Automatic Prompt Optimization and Role Prompting

arXiv.org Artificial Intelligence, Oct-14-2025

This report investigates approaches for prompting a tool-augmented large language model (LLM) to act as a role-playing dialogue agent in the API track of the Commonsense Persona-grounded Dialogue Challenge (CPDC) 2025. In this setting, dialogue agents often produce overly long in-character responses (over-speaking) while failing to use tools effectively according to the persona (under-acting), such as generating function calls that do not exist or making unnecessary tool calls before answering. We explore four prompting approaches to address these issues: 1) basic role prompting, 2) improved role prompting, 3) automatic prompt optimization (APO), and 4) rule-based role prompting. The rule-based role prompting (RRP) approach achieved the best performance through two novel techniques, character-card/scene-contract design and strict enforcement of function calling, which led to an overall score of 0.571, improving on the zero-shot baseline score of 0.519. These findings demonstrate that RRP design can substantially improve the effectiveness and reliability of role-playing dialogue agents compared with more elaborate methods such as APO. To support future efforts in developing persona prompts, we are open-sourcing all of our best-performing prompts and the APO tool. Source code is available at https://github.com/scb-10x/apo

artificial intelligence, large language model, natural language, (13 more...)

2509.00482

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Neural Information Processing Systems, Oct-8-2025, 23:32:58 GMT

7e0af0d1bc0ec2a90fc294be2e00447e-Paper-Conference.pdf

artificial intelligence, data mining, machine learning, (21 more...)

Country:

Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.45)

Industry: Information Technology > Services (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.68)
Information Technology > Game Theory (0.68)

Ebi, Daniel, Lambrechts, Gaspard, Ernst, Damien, Böhm, Klemens

Informed Asymmetric Actor-Critic: Leveraging Privileged Signals Beyond Full-State Access

arXiv.org Machine Learning, Oct-1-2025

Reinforcement learning in partially observable environments requires agents to act under uncertainty from noisy, incomplete observations. Asymmetric actor-critic methods leverage privileged information during training to improve learning under these conditions. However, existing approaches typically assume full-state access during training. In this work, we challenge this assumption by proposing a novel actor-critic framework, called informed asymmetric actor-critic, that enables conditioning the critic on arbitrary privileged signals without requiring access to the full state. We show that policy gradients remain unbiased under this formulation, extending the theoretical foundation of asymmetric methods to the more general case of privileged partial information. To quantify the impact of such signals, we propose informativeness measures based on kernel methods and return prediction error, providing practical tools for evaluating training-time signals. We validate our approach empirically on benchmark navigation tasks and synthetic partially observable environments, showing that our informed asymmetric method improves learning efficiency and value estimation when informative privileged inputs are available. Our findings challenge the necessity of full-state access and open new directions for designing asymmetric reinforcement learning methods that are both practical and theoretically sound.

agent, information, policy gradient, (15 more...)

arXiv.org Machine Learning

2509.26

Country:

Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
Europe > Belgium > Wallonia > Liège Province > Liège (0.04)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.51)

arXiv.org Artificial Intelligence, Sep-26-2025

Instruction-tuned Self-Questioning Framework for Multimodal Reasoning

Jang, You-Won, Heo, Yu-Jung, Kim, Jaeseok, Lee, Minsu, Chang, Du-Seong, Zhang, Byoung-Tak

The field of vision-language understanding has been actively researched in recent years, thanks to the development of Large Language Models (LLMs). However, it still struggles with problems requiring multi-step reasoning, even for very simple questions. Recent studies adopt LLMs to tackle this problem by iteratively generating sub-questions and answers. However, this approach has two disadvantages: 1) the fine-grained visual content of images is not available to LLMs that cannot read visual information, and 2) the internal mechanisms of black-box LLMs are inaccessible, which makes results difficult to reproduce. To solve these problems, we propose SQ (Self-Questioning)-InstructBLIP, which improves inference performance by iteratively generating image-aware, informative sub-questions and sub-answers. SQ-InstructBLIP consists of a Questioner, an Answerer, and a Reasoner that share the same architecture. The Questioner and Answerer generate sub-questions and sub-answers to help infer the answer to the main question, and the Reasoner performs reasoning on the main question while considering the generated sub-question information. Our experiments show that the proposed SQ-InstructBLIP, which uses the generated sub-questions as additional information when solving the VQA task, reasons more accurately than previous works.

artificial intelligence, large language model, natural language, (17 more...)

2509.21251

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)